Exploring the distribution of animacy: experiments on Norwegian

Author

  • Lilja Øvrelid
Abstract

Animacy is an inherent property of the referents of nouns which has been claimed to figure as an influencing factor in a range of different grammatical phenomena in various languages. In recent years several linguistic studies have examined the influence of argument animacy in grammatical phenomena such as differential object marking (Aissen, 2003), the passive construction (Dingare, 2001), and the dative alternation (Bresnan et al., 2005). A variety of languages are sensitive to the dimension of animacy in the expression and interpretation of core syntactic arguments (Lee, 2002; Øvrelid, 2004), either on a categorical level or as a strong statistical tendency.

This talk will report on machine learning experiments aimed at automatically acquiring animacy information for common nouns in Norwegian (Øvrelid, 2006). The experiments show that the animacy of a noun influences its linguistic distribution in such a consistent manner that an automatic classification based on distributional features is worthwhile. By exploiting the strong correlation between the animacy dimension and other linguistic dimensions, new knowledge about the semantic and distributional properties of various constructions may be obtained with very little manual effort. The experiments also raise the question of how the dimension of animacy may be conceptualised and delimited based on distributional evidence from large corpora.

Machine learning experiments are to a large degree dependent on the set of features chosen to represent the data. A key generalisation observed both in traditional typological linguistics and in the more recent linguistic studies where animacy figures is that prominent grammatical features tend to attract other prominent features (Aissen, 2003); subjects, for instance, will tend to be animate and agentive, whereas objects are prototypically inanimate and themes/patients. Exceptions to this generalisation express a more marked structure, a property which has consequences, for instance, for the distributional properties of the structure in question. For these experiments, a set of seven morphosyntactic features was selected, features which in various ways approximate the multi-faceted property of animacy. The seven features are presented in Table 1. In particular, these features exploit the strong correlation that animacy has to other linguistic dimensions, such as agentivity and discourse salience.

The experimental methodology is inspired by the experiments on the classification of intransitive verbs presented in Merlo and Stevenson (2001). For a set of forty highly frequent common nouns (20 animate, 20 inanimate), relative frequencies for the different morphosyntactic features described above were computed from the Oslo Corpus, a corpus of approximately 15 million words which has been automatically annotated with a Constraint Grammar tagger. The mean relative frequencies for each class (animate and inanimate) are presented in the first two rows of Table 2. As we can see, quite a few of the features express morphosyntactic cues that are rather rare, and there is also quite a bit of variation in the data (represented by the standard deviation for each class-feature combination).
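The relative-frequency computation described above can be illustrated with a minimal sketch. It assumes that a noun's relative frequency for a feature is the number of its occurrences showing that feature divided by its total number of occurrences, and that the per-class figures are the mean and sample standard deviation over the nouns in the class. The feature names ("subject", "object"), the noun placeholders, and all counts are invented for illustration; they are not the seven features of Table 1 or data from the Oslo Corpus.

```python
from statistics import mean, stdev


def relative_frequencies(feature_counts, total_occurrences):
    """Return count / total for each feature of a single noun."""
    if total_occurrences <= 0:
        raise ValueError("total_occurrences must be positive")
    return {feat: count / total_occurrences
            for feat, count in feature_counts.items()}


def class_summary(nouns):
    """Return {class: {feature: (mean, sample std dev)}} over nouns.

    Each noun is a dict with keys 'label', 'total' and 'counts'.
    """
    values = {}  # class label -> feature -> list of relative frequencies
    for noun in nouns:
        rel = relative_frequencies(noun["counts"], noun["total"])
        for feat, value in rel.items():
            values.setdefault(noun["label"], {}).setdefault(feat, []).append(value)
    return {
        label: {
            # stdev needs at least two data points; fall back to 0.0 otherwise
            feat: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0)
            for feat, vals in feats.items()
        }
        for label, feats in values.items()
    }


# Hypothetical toy counts, NOT taken from the paper or the Oslo Corpus.
TOY_NOUNS = [
    {"noun": "noun_a", "label": "animate", "total": 1000,
     "counts": {"subject": 400, "object": 100}},
    {"noun": "noun_b", "label": "animate", "total": 500,
     "counts": {"subject": 100, "object": 100}},
    {"noun": "noun_c", "label": "inanimate", "total": 800,
     "counts": {"subject": 80, "object": 320}},
    {"noun": "noun_d", "label": "inanimate", "total": 400,
     "counts": {"subject": 80, "object": 120}},
]

if __name__ == "__main__":
    for label, feats in class_summary(TOY_NOUNS).items():
        for feat, (avg, sd) in feats.items():
            print(f"{label:10s} {feat:8s} mean={avg:.3f} sd={sd:.3f}")
```

Keeping the standard deviation next to each class mean mirrors the point made above: with rare cues and few nouns per class, the spread matters as much as the mean when judging whether a feature separates the classes.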


Similar resources

Towards Robust Animacy Classification Using Morphosyntactic Distributional Features

This paper presents results from experiments in automatic classification of animacy for Norwegian nouns using decision-tree classifiers. The method makes use of relative frequency measures for linguistically motivated morphosyntactic features extracted from an automatically annotated corpus of Norwegian. The classifiers are evaluated using leave-one-out training and testing and the initial resul...


Exploring the Role of Occurring Errors Distribution in the Distribution of Corrective Feedback Targets

This study attempted to compare corrected linguistic errors in foreign language classrooms and all errors occurring in these classes to see which types of errors are more attended to by teachers in relation to their occurrence in classes. For this purpose, 69 hours of the classes of 34 teachers teaching in different language schools were recorded and the errors corrected by these teachers were ...


Light distribution in Chinese solar greenhouse and its effect on plant growth

Chinese solar greenhouse (CSG) is universally applied in northern China for producing horticultural products. CSG is characterized by the unbalanced structures with an arched front roof face to the south side and a thick wall as well as back roof in the north side. Such structures affect light distribution in the greenhouse. This study aims to investigate the light distribution properties in CS...


Rethinking the Theory of Change for Health in All Policies; Comment on “Health Promotion at Local Level in Norway: The Use of Public Health Coordinators and Health Overviews to Promote Fair Distribution Among Social Groups”

This commentary discusses the interesting and surprising findings by Hagen and colleagues, focusing on the role of the public health coordinator as a Health in All Policies (HiAP) tool. The original article finds a negative association between the employment of public health coordinators in Norwegian municipalities and consideration of a fair distribution of social and economic resources betwee...


A categorical recall strategy does not explain animacy effects in episodic memory.

Animate stimuli are better remembered than matched inanimate stimuli in free recall. Three experiments tested the hypothesis that animacy advantages are due to a more efficient use of a categorical retrieval cue. Experiment 1 developed an "embedded list" procedure that was designed to disrupt participants' ability to perceive category structure at encoding; a strong animacy effect remained. Exp...




Publication date: 2006